Q-learning algorithm in machine learning